[Name of the Document] Specification

[Title of the Invention] Image retrieving apparatus, image retrieving 
method and recording medium for recording program to implement the 
image retrieving method 
[Claims]

[Claim 1] An image retrieving apparatus for retrieving an image
similar to a predetermined query image out of subject videos for retrieval,
characterized in that

said image retrieving apparatus comprises:
a frame feature vector extracting means for extracting a
feature vector of at least a part of frames included in said subject videos
for retrieval, and for outputting said extracted one as a frame feature
vector;

a frame feature vector storing means for storing the frame
feature vector outputted by said frame feature vector extracting means;

an image feature vector extracting means for extracting a
feature vector of said query image and for outputting said extracted one
as an image feature vector;

a similarity calculating means for comparing the frame
feature vector stored in said frame feature vector storing means with the
image feature vector outputted by said image feature vector extracting
means to thereby calculate a similarity of both vectors;

a frame feature vector integrating means for integrating
frame feature vectors out of those stored in said frame feature vector
storing means that satisfy a predetermined condition on similarity into
at least one group; and

a similar image selecting means for selecting at least one
frame feature vector of a highest similarity out of the group integrated by
said frame feature vector integrating means,

whereby an image having the frame feature vector that is
selected by said similar image selecting means is presented as a result of
retrieval.

[Claim 2] The image retrieving apparatus as set forth in claim 1,
characterized in that

said frame feature vector integrating means comprises:
a frame feature vector selecting means for selecting a
frame feature vector of a similarity that is calculated by said similarity
calculating means and is higher than a predetermined threshold value,
out of frame feature vectors stored in said frame feature vector storing
means; and

a similar segment generating means for integrating frame
feature vectors that are continuous in time, out of the frame feature
vectors selected by said frame feature vector selecting means, into one
group and for outputting the integrated group.

[Claim 3] An image retrieving apparatus for retrieving a video
segment similar to a predetermined query video, out of subject videos for
retrieval, characterized in that

said image retrieving apparatus comprises:

a frame feature vector extracting means for extracting a
feature vector of some or all frames, out of the subject videos for retrieval
and for outputting the extracted one as a frame feature vector;

a frame feature vector storing means for storing the frame
feature vector outputted by said frame feature vector extracting means;

a video feature vector extracting means for extracting a
feature vector of some or all frames included in said query video, and for
outputting the extracted one as a first video feature vector;

a video feature vector cutout means for cutting out a
frame feature vector corresponding to a time length that the query video
inputted into said video feature vector extracting means has, out of the
frame feature vectors stored in said frame feature vector storing means,
and for outputting the cutout one as a second video feature vector;

a similarity calculating means for comparing said first
video feature vector outputted by said video feature vector extracting
means with said second video feature vector outputted by said video
feature vector cutout means, to thereby calculate a similarity of both
vectors;

a video feature vector integrating means for integrating
the second video feature vectors, out of those outputted by said video
feature vector cutout means, that satisfy a predetermined condition on
similarity into at least one group; and

a similar video segment selecting means for selecting at
least one of the second video feature vectors that has a highest similarity
in the group integrated by said video feature vector integrating means,

whereby a video segment having the second video feature
vector selected by said similar video segment selecting means is
presented as a result of retrieval.

[Claim 4] The image retrieving apparatus as set forth in claim 3,
characterized in that

said video feature vector integrating means comprises:
a video feature vector selecting means for selecting a
second video feature vector of which a similarity calculated by said
similarity calculating means is higher than a predetermined threshold
value, out of second video feature vectors outputted by said video feature
vector cutout means; and

a similar segment generating means for integrating the
second video feature vectors that are either continuous in time or
partially duplicate, out of those selected by said video feature vector
selecting means, into one group, and for outputting the integrated group.

[Claim 5] The image retrieving apparatus as set forth in any one of
the claims 1 through 4, characterized in that
said frame feature vector extracting means generates a
resized image for at least a part of frames included in said subject videos
for retrieval, and extracts a frame feature vector by applying a frequency
conversion and a quantizing processing to said resized image.

[Claim 6] An image retrieving method of retrieving an image
similar to a predetermined query image out of subject videos for retrieval,
characterized in that

said image retrieving method comprises the sequential
steps of:

extracting a frame feature vector of at least a part of
frames included in said subject videos for retrieval;

storing said extracted frame feature vector;

extracting an image feature vector of said query image;

comparing said frame feature vector with said image feature vector to
thereby calculate a similarity of both vectors;

integrating frame feature vectors of which the similarities
satisfy a predetermined condition on similarity into at least one group;

selecting at least one frame feature vector of the highest
similarity in said integrated group; and

proposing an image having said selected frame feature vector as a result
of retrieval.

[Claim 7] The image retrieving method as set forth in claim 6,
characterized in that

the integration of said frame feature vectors into said
group is implemented in such a manner that the frame feature vectors of
which the similarities are higher than a predetermined threshold value
are selected, and that out of said selected frame feature vectors, those
that are continuous in time are integrated into one group.

[Claim 8] An image retrieving method of retrieving a video segment
similar to a predetermined query video out of subject videos for retrieval,
characterized in that

said image retrieving method comprises the sequential
steps of:

extracting at least a part of frame feature vectors
included in said subject videos for retrieval;

storing said extracted frame feature vectors;

extracting a video feature vector of at least a part of
frames included in said query video;

cutting out a video feature vector of a frame
corresponding to a time length that said query video has, out of said
frame feature vectors;

comparing said video feature vector extracted from said
query video with the video feature vector cut out from said frame feature
vectors, to thereby calculate a similarity of both vectors;

integrating video feature vectors of which said similarities
satisfy a predetermined condition, out of the video feature vectors cut out
from said frame feature vectors, into at least one group;

selecting at least one video feature vector of a highest
similarity in said integrated group; and

proposing a video segment having said selected video feature vector as a
result of retrieval.

[Claim 9] The image retrieving method as set forth in claim 8,
characterized in that
the integration of said video feature vectors into said
group is implemented by the process that the video feature vectors of
which said similarities are higher than a predetermined threshold value
are selected, and those that are either continuous in time or partly
duplicate in the selected video feature vectors are integrated into one
group.

[Claim 10] The image retrieving method as set forth in any one of the
claims 6 through 9, characterized in that
said extracted frame feature vector is extracted in such a
manner that a resized image is produced for at least a part of frames
included in said subject videos for retrieval, and that a frequency
conversion and a quantizing processing are applied to said resized image.

[Claim 11] A recording medium, characterized in that

the image retrieving method as set forth in any one of the
claims 6 through 10 is written therein.

[Detailed Explanation of the Invention] 
[0001]

[Technical Field to which the Invention pertains]

The present invention relates to an image retrieving apparatus 
and an image retrieving method. More particularly, the present 
invention relates to an apparatus for and a method of retrieving an 
image similar to a predetermined query image out of videos. 

[0002]
[Prior Art] 

Hitherto, an image retrieving apparatus having a video database
for storing video data has adopted an image retrieving method in
which image data similar to either a predetermined image (hereinafter
referred to as a query image) or a predetermined video segment
(hereinafter referred to as a query video segment) is retrieved out of
the image data stored in the video database.

[0003] 

In one typical image retrieving method of this kind, a query
image is compared with all frames of the videos, and the images are sorted
in decreasing order of similarity. However, in this image retrieving
method, too many images are presented as candidates, and therefore it
takes a long time to complete image retrieval.

[0004]

Thus, Laid-open Japanese Patent Publication No. 11-259061
discloses a different method in which a change in an image scene,
usually referred to as a scene-change, is detected in advance in the
stored videos, and only the one frame immediately after each
scene-change is stored as a representative frame. The retrieval process
then retrieves a similar image only out of the stored representative
frames instead of all the frames contained in the video data.

[0005] 

[Problem to be solved by the Invention]

However, the image retrieving methods according to the prior art
encounter the problems described below.

[0006] 

Namely, in the method of comparing a query image with all frames
of a video and presenting the nominated image data in order of
similarity, since the video is a set of frames continuing in time, the
successive frames are, in general, quite similar to one another in their
contents. Thus, the continuous frames contained in a certain shot all
end up being nominated and presented. Accordingly, the number of
nominated and presented images increases, causing the problem that a
lot of time is needed to complete retrieval of an image.

[0007] 

In the method disclosed in the Laid-open Japanese Patent
Publication No. 11-259061, a query image is retrieved out of only a
part of the frames, such as the frame images obtained by the detection
of a scene-change, and therefore the frames contained within a scene
are not retrieved. Thus, retrieval cannot be implemented by the unit of
frame. If a certain scene contains quite a lot of motion activity, the
content of the first frame in the scene may be greatly different from
those of the other frames within the scene. In this case, a problem
might occur in which a desired frame is not included in the
representative frames that are subjected to the retrieving process.

On the other hand, once a scene-change has been detected,
it may be possible to further retrieve a query image out of
the respective frames within the scene. Nevertheless, if the
scene-change of a scene containing a desired image fails to be detected,
the desired image cannot be included in the subject of retrieval, and as
a result, the desired image cannot be retrieved.

[0008] 

Therefore, the present invention was made in view of the
aforementioned various problems of the prior art.

Namely, an object of the present invention is to provide an image
retrieving apparatus for and an image retrieving method of retrieving an
image in which the number of similar images nominated and presented
is controlled while implementing the retrieving of the similar images by
the unit of frame.

Another object of the present invention is to provide a recording
medium in which the above-mentioned retrieving method is written.

[0009] 

[Means for solving Problem]

An image retrieving apparatus according to the present invention,
which is an apparatus for retrieving an image similar to a predetermined
query image out of subject videos to be retrieved, comprises:

a frame feature vector extracting means for extracting a feature
vector of at least a part of frames included in the subject videos for
retrieval, and for outputting the extracted one as a frame feature vector;

a frame feature vector storing means for storing the frame feature vector
outputted by the frame feature vector extracting means;

an image feature vector extracting means for extracting a feature
vector of the query image and for outputting the extracted one as an
image feature vector;

a similarity calculating means for comparing the frame feature
vector stored in the frame feature vector storing means with the image
feature vector outputted by the image feature vector extracting means to
thereby calculate the similarity of both vectors;

a frame feature vector integrating means for integrating frame
feature vectors out of those stored in the frame feature vector storing
means that satisfy a predetermined condition on similarity into at least
one group; and

a similar image selecting means for selecting at least one frame
feature vector of the highest similarity, out of the group integrated by the
frame feature vector integrating means,

whereby an image having the frame feature vector that is
selected by the similar image selecting means is presented as a result of
retrieval.

[0010] 

Further, the frame feature vector integrating means is
characterized by comprising:

a frame feature vector selecting means for selecting a frame
feature vector of a similarity that is calculated by the similarity
calculating means and is higher than a predetermined threshold value,
out of frame feature vectors stored in the frame feature vector storing
means; and

a similar segment generating means for integrating frame
feature vectors that are continuous in time, out of the frame feature
vectors selected by the frame feature vector selecting means, into one
group and for outputting the integrated group.

[0011]

Furthermore, an image retrieving apparatus for retrieving a
video segment similar to a predetermined query video out of subject
videos for retrieval comprises:

a frame feature vector extracting means for extracting a feature
vector of at least a part of frames, out of the subject videos for retrieval,
and for outputting the extracted one as a frame feature vector;

a frame feature vector storing means for storing the frame
feature vector outputted by the frame feature vector extracting means;

a video feature vector extracting means for extracting a feature
vector of at least a part of frames included in a query video, and for
outputting the extracted one as a first video feature vector;

a video feature vector cutout means for cutting out a frame
feature vector corresponding to a time length that the query video
inputted into the video feature vector extracting means has, out of the
frame feature vectors stored in the frame feature vector storing means,
and for outputting the cutout one as a second video feature vector;

a similarity calculating means for comparing the first video
feature vector outputted by the video feature vector extracting means
with the second video feature vector outputted by the video feature
vector cutout means to thereby calculate a similarity of both compared
vectors;

a video feature vector integrating means for integrating the
second video feature vectors out of those outputted by the video feature
vector cutout means that satisfy a predetermined condition on similarity
into at least one group; and

a similar image selecting means for selecting at least one of the
second video feature vectors that has the highest similarity in the group
integrated by the video feature vector integrating means,

whereby an image having the second video feature vector selected
by the similar image selecting means is presented as a result of retrieval.

[0012] 

Further, the video feature vector integrating means is
characterized by comprising:

a video feature vector selecting means for selecting a second video
feature vector of which a similarity calculated by the similarity
calculating means is higher than a predetermined threshold value, out of
second video feature vectors outputted by the video feature vector cutout
means; and

a similar segment generating means for integrating the second
video feature vectors that are either continuous in time or partially
duplicate, out of those selected by the video feature vector selecting
means, into one group, and for outputting the integrated group.

[0013]

Further, the frame feature vector extracting means is
characterized in that it generates a resized image for at least a part of
frames included in the subject videos for retrieval, and extracts a frame
feature vector by applying a frequency conversion and a quantizing
processing to the said resized image.

[0014] 

An image retrieving method according to the present invention,
which is a method of retrieving an image similar to a predetermined
query image out of subject videos for retrieval, is characterized by
sequentially implementing:

a process for extracting a frame feature vector of at least a part of
frames included in the subject videos for retrieval;

a process for storing the extracted frame feature vector;

a process for extracting an image feature vector of the query
image;

a process for comparing the frame feature vector with the said
image feature vector to thereby calculate similarity of both feature
vectors;

a process for integrating the frame feature vectors of which the
similarities satisfy a predetermined condition on similarity into at least
one group;

a process for selecting at least one frame feature vector of the
highest similarity in the integrated group; and

a process for proposing an image having the selected frame
feature vector as a result of retrieval.

[0015] 

Further, the integration of the frame feature vectors into the
group is characterized in that the frame feature vectors of which the
similarities are higher than a predetermined threshold value are selected,
and out of the selected frame feature vectors, those that are continuous
in time are integrated into one group.

[0016] 

Further, an image retrieving method of retrieving a video
segment similar to a predetermined query video out of subject videos for
retrieval is characterized by sequentially implementing:

a process for extracting at least a part of frame feature vectors
included in the subject videos for retrieval;

a process for storing extracted frame feature vectors;

a process for extracting a video feature vector of at least a part of
frames included in the query video;

a process for cutting out a video feature vector of a frame
corresponding to a time length that the query video has, out of the frame
feature vectors;

a process for comparing the video feature vector extracted from
the query video with the video feature vector cut out from the frame
feature vectors to thereby calculate the similarity of both feature vectors;

a process for integrating video feature vectors of which the
similarities satisfy a predetermined condition, out of the video feature
vectors cut out from the frame feature vectors, into at least one group;

a process for selecting at least one video feature vector of the
highest similarity in the integrated group; and

a process for proposing an image having the selected video
feature vector as a result of retrieval.

[0017] 

Furthermore, the integration of the video feature vectors into the
group is characterized by implementing the process that the video
feature vectors of which the similarities are higher than a predetermined
threshold value are selected, and those that are either continuous in time
or partly duplicate in the selected video feature vectors are integrated
into one group.

[0018] 

Still further, the frame feature vector is characterized in that a
resized image is produced for at least a part of frames included in the
subject videos for retrieval, and a frequency conversion and a quantizing
processing are applied to the resized image.

[0019] 

A recording medium according to the present invention is
characterized in that a program permitting a computer to implement the
above-mentioned image retrieving method is written in the medium.

[0020] 
(Action)

In the present invention provided with the above-described
constitution and arrangement, when the query image and the subject
videos for retrieval are inputted, the feature vector of at least a part
of the frames included in the inputted subject videos for retrieval is
first extracted by the frame feature vector extracting means, and the
result of extraction is outputted as a frame feature vector and stored in
the frame feature vector storing means. In the image feature vector
extracting means, a feature vector of the inputted query image is
extracted and outputted as an image feature vector. The frame feature
vector stored in the frame feature vector storing means and the image
feature vector outputted by the image feature vector extracting means are
inputted into the similarity calculating means, where the similarity of
both vectors is calculated. Then, in the frame feature vector selecting
means provided in the frame feature vector integrating means, the frame
feature vectors whose similarities calculated by the similarity
calculating means are higher than the predetermined value are selected
out of the frame feature vectors stored in the frame feature vector
storing means. Thereafter, in the similar segment generating means
provided in the frame feature vector integrating means, the frame feature
vectors that are continuous in time, among those selected by the frame
feature vector selecting means, are integrated into one group and
outputted. The frame feature vectors integrated by the frame feature
vector integrating means are inputted into the similar image selecting
means, which selects at least one frame feature vector of the highest
similarity in the group integrated by the frame feature vector
integrating means. Thereafter, the image having the frame feature vector
selected by the similar image selecting means is presented as a result of
retrieval.
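
The selection, integration and similar image selection described above can be sketched as follows. This is only a minimal illustration, not the claimed implementation: it assumes that one similarity value per stored frame has already been calculated, that a strict "higher than" comparison against the threshold is used, and that exactly one frame is kept per group. The function name select_representatives is hypothetical.

```python
from typing import List, Tuple


def select_representatives(similarities: List[float],
                           threshold: float) -> List[Tuple[int, float]]:
    """Return one (frame_index, similarity) pair per group.

    A group is a run of frames that are continuous in time and whose
    similarity is higher than the threshold; the frame of the highest
    similarity in each run is kept.
    """
    results: List[Tuple[int, float]] = []
    best = None  # best (index, similarity) seen in the current run
    for index, sim in enumerate(similarities):
        if sim > threshold:
            # Frame selected: extend the current run, track its best frame.
            if best is None or sim > best[1]:
                best = (index, sim)
        elif best is not None:
            # Run ended: output its best frame and start over.
            results.append(best)
            best = None
    if best is not None:
        results.append(best)
    return results


sims = [0.1, 0.8, 0.9, 0.7, 0.2, 0.3, 0.95, 0.1]
print(select_representatives(sims, 0.5))  # [(2, 0.9), (6, 0.95)]
```

In this example frames 1 to 3 form one group and frame 6 forms another, so two images are presented instead of four.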

[0021]

Further, when the query video and the subject videos for retrieval
are inputted, a feature vector of at least a part of the frames included in
the inputted subject videos for retrieval is first extracted in the frame
feature vector extracting means, and is outputted as a frame feature
vector and stored in the frame feature vector storing means. In the
video feature vector extracting means, a feature vector of at least a
part of the frames included in the inputted query video is extracted and
outputted as a first video feature vector. Further, in the video feature
vector cutout means, the frame feature vector corresponding to the time
length of the query video inputted into the video feature vector
extracting means is cut out from the frame feature vectors stored in
the frame feature vector storing means, and is outputted as a second
video feature vector.

The first video feature vectors outputted by the video feature
vector extracting means and the second video feature vectors outputted
by the video feature vector cutout means are inputted into the similarity
calculating means, where the similarity of both is calculated.
Thereafter, in the video feature vector selecting means provided in the
video feature vector integrating means, the second video feature vectors
whose similarity calculated by the similarity calculating means is
higher than the predetermined threshold value are selected out of the
second video feature vectors outputted by the video feature vector
cutout means. Further, in the similar segment generating means provided
in the video feature vector integrating means, the second video feature
vectors that are either continuous or duplicate in time, out of those
selected by the video feature vector selecting means, are integrated
into one group and outputted. The second video feature vectors
integrated by the video feature vector integrating means are inputted
into the similar image selecting means, and at least one second video
feature vector of the highest similarity in the group integrated by the
video feature vector integrating means is selected. Thereafter, an image
having the second video feature vector selected by the similar image
selecting means is presented as a result of retrieval.
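
The cutout, comparison and integration for a query video can be sketched in the same way. The sketch below rests on several assumptions that the text does not state: each frame feature vector is taken to be a 64-bit integer, similarity between two frames is taken to be the share of equal bits, the similarity of a cut-out window is the mean over its frames, and the window is shifted one stored frame at a time. All function names are hypothetical.

```python
from typing import List, Tuple


def frame_similarity(a: int, b: int, bits: int = 64) -> float:
    """Assumed metric: share of equal bits between two feature vectors."""
    differing = bin((a ^ b) & ((1 << bits) - 1)).count("1")
    return 1.0 - differing / bits


def window_similarities(stored: List[int], query: List[int]) -> List[float]:
    """Cut out every window with the query's length and score it."""
    n = len(query)
    scores = []
    for start in range(len(stored) - n + 1):
        window = stored[start:start + n]  # the "second video feature vector"
        total = sum(frame_similarity(w, q) for w, q in zip(window, query))
        scores.append(total / n)
    return scores


def best_segments(scores: List[float], window_len: int,
                  threshold: float) -> List[Tuple[int, float]]:
    """Group selected windows that are continuous in time or overlap,
    and keep the (start_index, similarity) of the best window per group."""
    results: List[Tuple[int, float]] = []
    best = None
    group_end = -1  # index one past the last frame covered by the group
    for start, score in enumerate(scores):
        if score <= threshold:
            continue
        if best is not None and start > group_end:
            # Gap between this window and the group: close the group.
            results.append(best)
            best = None
        if best is None or score > best[1]:
            best = (start, score)
        group_end = start + window_len
    if best is not None:
        results.append(best)
    return results


stored = [0, 0, 5, 9, 0, 0, 0, 0, 5, 9]
query = [5, 9]
scores = window_similarities(stored, query)
print(best_segments(scores, len(query), 0.99))  # [(2, 1.0), (8, 1.0)]
```

Each returned start index identifies one video segment of the query's length in the stored video.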

[0022]

Thus, while the number of the similar images that are nominated
and presented is suppressed, the retrieving of a similar image is
implemented by the unit of frame.

[0023]

[Mode for carrying out the Invention]

The preferred embodiments of the present invention will be
explained hereinbelow with reference to the drawings.

[0024]

(The First Embodiment)
Figure 1 is a block diagram illustrating an image retrieving
apparatus according to a first embodiment of the present invention.
[0025] 

As shown in Fig. 1, the present embodiment includes a frame
feature vector extracting portion 10 into which subject videos for
retrieval are inputted, which extracts a feature vector of each of the
frames included in the inputted subject videos for retrieval and outputs
the extracted feature vectors as frame feature vectors; a frame feature
vector storing portion 20 for storing the frame feature vectors
outputted by the frame feature vector extracting portion 10; an image
feature vector extracting portion 30 into which a query image is
inputted, which extracts feature vectors of the inputted query image
and outputs the extracted vectors as image feature vectors; a similarity
calculating portion 40 for comparing the image feature vectors outputted
by the image feature vector extracting portion 30 with the frame feature
vectors stored in the frame feature vector storing portion 20 to thereby
calculate the similarity of both vectors; a frame feature vector
integrating portion 50 for integrating the frame feature vectors of
which the similarities calculated by the similarity calculating portion
40 satisfy a predetermined condition, out of the frame feature vectors
stored in the frame feature vector storing portion 20, into one or a
plurality of groups and outputting the groups; and a similar image
selecting portion 60 for selecting one or a plurality of frame feature
vectors of the highest similarity, out of the groups of frame feature
vectors outputted by the frame feature vector integrating portion 50,
and for outputting the selected frame feature vectors. Thus, images
having the frame feature vectors outputted by the similar image
selecting portion 60 are outputted as a result of retrieval.

The frame feature vector integrating portion 50 includes a frame
feature vector selecting portion 51 for selecting the frame feature vectors
of which the similarities calculated by the similarity calculating portion
40 are equal to or larger than a predetermined value, within the frame
feature vectors stored in the frame feature vector storing portion 20, and
a similar segment generating portion 52 for integrating the frame feature
vectors that are continuous in time, within those selected by the frame
feature vector selecting portion 51, into one group, and for outputting the
integrated group as similar segments.
[0026] 

The description of the image retrieving method carried out by the
image retrieving apparatus having the above-described constitution and
arrangement will be provided hereinbelow.
[0027] 

The videos that are subjects for retrieval are inputted into the 
frame feature vector extracting portion 10, and the query images are 
inputted into the image feature vector extracting portion 30. 
[0028]

In the frame feature vector extracting portion 10, a feature vector
of each of the frames included in the inputted subject videos for
retrieval is extracted and outputted as a frame feature vector. The
extraction of the frame feature vectors by the frame feature vector
extracting portion 10 is not always required to be implemented for all
of the frames, and the frame feature vectors may be extracted, for
example, at a rate of approximately twice per second.
[0029] 

Now, the detailed explanation of the extracting method of the
frame feature vectors implemented by the frame feature vector
extracting portion 10 will be provided below.

[0030] 

The extraction of the frame feature vectors in the frame feature
vector extracting portion 10 may be accomplished by, for example, the
measure disclosed in Japanese Patent Application No. 11-059432
(hereinafter referred to as the related art) filed previously by the
present Applicant. However, as a detailed description of the art
disclosed in this related art would be cumbersome, a brief
explanation of that art will be provided hereinbelow based on a concrete
example.
[0031] 

Now, when a certain image is inputted, the image is divided into
8 x 8 (= 64) blocks, and then an average value is calculated with respect
to each of the blocks to produce a thumbnail picture (namely, a picture
of a thumbnail size like an icon) having 8 pixels x 8 pixels. At this
stage, since an image is usually a color image consisting of the three
primary colors of RGB, a thumbnail picture of 8 pixels x 8 pixels is
produced for each of the three primary colors. Here, however, by way of
example, three pictures are produced that correspond not to RGB but to
three kinds of signals consisting of Y (a luminance signal) and R-Y and
B-Y (color difference signals).
[0032] 

Subsequently, the DCT (discrete cosine transform) is applied to
the thumbnail image to make a frequency conversion so that
frequency-expressed information corresponding to the 8 x 8 pixels is
obtained.
[0033] 

Then, low frequency components are selected from the information
corresponding to the 8 x 8 pixels. For example, 6 components are selected
from the Y signal, and 3 components are selected from each of the R-Y
signal and the B-Y signal, so that a total of 12 components are
selected. Then, these 12 coefficients are roughly quantized to extract
information of 64 bits in total as a frame feature vector. It should here
be noted that when the quantization of the coefficients is implemented,
the quantizing characteristic as well as the number of quantization
levels are changed for every coefficient. As a result of the
above-mentioned processing, the information expressed by the low frequency
components contained in the image is obtained as frame feature vectors.
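
The extraction steps of paragraphs [0031] to [0033] can be sketched as follows. This is only an illustrative sketch, not the method of the related art: the per-coefficient bit widths (Y_BITS, C_BITS), the choice of low-frequency positions (LOW_FREQ), the uniform quantizer and the clipping range (limit) are all assumptions introduced here; the source states only that 6 + 3 + 3 coefficients are quantized into 64 bits in total with characteristics that differ per coefficient.

```python
import math

# Hypothetical bit widths (assumption): 34 + 15 + 15 = 64 bits in total.
Y_BITS = (7, 6, 6, 5, 5, 5)   # 6 coefficients of the Y signal
C_BITS = (6, 5, 4)            # 3 coefficients each for R-Y and B-Y
# Hypothetical low-frequency positions (u, v), lowest frequencies first.
LOW_FREQ = ((0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2))


def thumbnail(plane):
    """Average a plane (list of rows, sides multiples of 8) over 8 x 8 blocks."""
    bh, bw = len(plane) // 8, len(plane[0]) // 8
    return [[sum(plane[r * bh + i][c * bw + j]
                 for i in range(bh) for j in range(bw)) / (bh * bw)
             for c in range(8)] for r in range(8)]


def _scale(k):
    """Normalization factor of the orthonormal 8-point DCT-II."""
    return math.sqrt(1 / 8) if k == 0 else math.sqrt(2 / 8)


def dct_coefficient(block, u, v):
    """Coefficient (u, v) of the orthonormal 2-D DCT-II of an 8 x 8 block."""
    total = sum(block[x][y]
                * math.cos(math.pi * (2 * x + 1) * u / 16)
                * math.cos(math.pi * (2 * y + 1) * v / 16)
                for x in range(8) for y in range(8))
    return _scale(u) * _scale(v) * total


def quantize(value, bits, limit):
    """Uniformly quantize a value clipped to [-limit, limit] into `bits` bits."""
    levels = (1 << bits) - 1
    clipped = max(-limit, min(limit, value))
    return round((clipped + limit) / (2 * limit) * levels)


def frame_feature(y, ry, by, limit=2048.0):
    """Pack the 12 quantized coefficients into one 64-bit integer."""
    code = 0
    for plane, bit_widths in ((y, Y_BITS), (ry, C_BITS), (by, C_BITS)):
        thumb = thumbnail(plane)
        for (u, v), bits in zip(LOW_FREQ, bit_widths):
            coeff = dct_coefficient(thumb, u, v)
            code = (code << bits) | quantize(coeff, bits, limit)
    return code
```

Packing the result into a single integer keeps each frame feature vector at 8 bytes, which is consistent with the compact 64-bit representation described above.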
[0034] 

The frame feature vectors outputted by the frame feature vector
extracting portion 10 are stored in the frame feature vector storing
portion 20.
[0035]

On the other hand, in the image feature vector extracting portion
30, the feature vectors of the inputted query image are extracted and
outputted as image feature vectors.
[0036]

In the similarity calculating portion 40, the similarity between
the image feature vectors outputted by the image feature vector extracting
portion 30 and the frame feature vectors stored in the frame feature
vector storing portion 20 is calculated. The similarity calculation
in the similarity calculating portion 40 is implemented by the unit of
frame feature vector, so that a similarity is outputted for each
frame feature vector. Further, it should be understood that this
similarity calculation can be effected at an extremely high speed by the
method disclosed in the aforementioned related art and so on.
[0037]

Then, in the frame feature vector selecting portion 51 of the
frame feature vector integrating portion 50, only the frame feature
vectors of which the similarities calculated by the similarity calculating
portion 40 satisfy a predetermined condition are selected out of those
stored in the frame feature vector storing portion 20. At this stage, the
above-mentioned predetermined condition based on which the selection of
the frame feature vectors is implemented by the frame feature vector
selecting portion 51 could be, e.g., a condition such that a frame
feature vector is selected only when its similarity calculated by the
similarity calculating portion 40 exceeds a predetermined threshold
value. Further, the threshold value could be adaptively changed as
required.
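
The per-frame similarity calculation and the threshold-based selection of paragraphs [0036] and [0037] can be sketched as follows. The similarity measure used here (the fraction of matching bits between two 64-bit codes) is purely a placeholder assumption; the source does not define the measure and defers to the related art. The function names and the threshold value are likewise hypothetical.

```python
def similarity(a: int, b: int, bits: int = 64) -> float:
    """Placeholder similarity (assumption): fraction of matching bits."""
    return 1.0 - bin(a ^ b).count("1") / bits


def select_frames(frame_codes, query_code, threshold):
    """Return (index, similarity) for every stored frame feature vector
    whose similarity to the query exceeds the threshold."""
    scored = ((i, similarity(code, query_code))
              for i, code in enumerate(frame_codes))
    return [(i, s) for i, s in scored if s > threshold]


# Example: only the first two stored codes are close to the query code 0.
selected = select_frames([0b0, 0b1, 0xFFFFFFFFFFFFFFFF], 0b0, threshold=0.9)
```

Because the threshold is an ordinary argument, it can be varied between queries, in line with the remark that the threshold value could be adaptively changed.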
[0038] 

Subsequently, in the similar segment generating portion 52, the
frame feature vectors that are continuous in time, out of those selected
by the frame feature vector selecting portion 51, are integrated together
into one group and are outputted as a similar segment. In this case, a
continuously existing segment can be considered as a segment in which
the frame feature vectors selected by the frame feature vector selecting
portion 51 exist continuously in time. More specifically, it is a segment
in which, between one frame feature vector and another frame feature
vector that were both selected by the frame feature vector selecting
portion 51, there exists no frame feature vector that was not selected by
the frame feature vector selecting portion 51. However, when a selected
frame feature vector is not continuous in time with any other selected
frame feature vector, so that it exists alone, that single frame feature
vector is outputted as a similar segment by itself.
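
The grouping rule of paragraph [0038] can be sketched as follows, under the assumption (introduced here) that each stored frame feature vector is identified by its integer position in the stored sequence, so that two selected vectors are continuous in time when their positions differ by one. The function name is hypothetical.

```python
def similar_segments(selected_indices):
    """Group selected positions into runs that contain no unselected
    position in between; an isolated position forms its own segment."""
    segments = []
    for idx in sorted(selected_indices):
        if segments and idx == segments[-1][-1] + 1:
            segments[-1].append(idx)   # continues the current segment
        else:
            segments.append([idx])     # starts a new segment
    return segments
```

For example, the selected positions 3, 4, 5, 9, 12 and 13 yield three segments: positions 3 to 5, the isolated position 9, and positions 12 to 13.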
[0039] 

Figure 2 is a diagrammatic view illustrating how the processing
is implemented by the similar segment generating portion 52 shown in
Fig. 1.

In Fig. 2, the abscissa is the time axis indicating the time
positions of the respective frame feature vectors stored in the frame
feature vector storing portion 20, and the ordinate is an axis of
similarity indicating the similarities of the respective frame feature
vectors calculated by the similarity calculating portion 40.
[0040]

As illustrated in Fig. 2, in the similar segment generating portion
52, the frame feature vectors within a segment in which the frame
feature vectors selected by the frame feature vector selecting portion 51
exist continuously in time are integrated together into one group, and
are outputted as a similar segment.
[0041] 

Thereafter, one or a plurality of frame feature vectors of the
highest similarity within the similar segments outputted by the similar
segment generating portion 52 are selected by the similar image
selecting portion 60, and images having the frame feature vectors
selected by the similar image selecting portion 60 are presented as a
result of retrieval.
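
The selection performed by the similar image selecting portion 60 can be sketched as follows. The data layout (segments as lists of frame positions, similarities as a mapping from position to score) and the parameter top_n are assumptions made for illustration.

```python
def best_per_segment(segments, similarities, top_n=1):
    """For each similar segment, return the top_n frame positions
    ranked by similarity, highest first."""
    results = []
    for segment in segments:
        ranked = sorted(segment, key=lambda i: similarities[i], reverse=True)
        results.append(ranked[:top_n])
    return results
```

With top_n set to 1, each similar segment contributes exactly one representative image to the retrieval result, which is what limits the number of nominated images.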
[0042] 

In the above-described embodiment, since all of the frame feature
vectors extracted from the subject videos for retrieval are collated with
the image feature vectors of the query image, the retrieving of similar
images can be implemented by the unit of frame. Furthermore, instead of
proposing all similar frames, the similar frames that exist
continuously in time are integrated into one group, and some images of
the highest similarity within each of the groups are selected for
proposal. Therefore, the retrieving of an image can be achieved while
controlling or suppressing the number of similar images to be nominated.
[0043] 

Further, it is possible to specify a particular scene by the
unit of frame. Thus, when images similar to the query image are
included in a program, even if the similar image does not appear in the
introduction of the program, it is possible to accurately obtain a cue in
the program.
[0044] 

(The Second Embodiment)

Figure 3 is a block diagram illustrating an image retrieving 
apparatus according to a second embodiment of the present invention. 
[0045] 

As illustrated in Fig. 3, the present embodiment includes a frame
feature vector extracting portion 10 into which subject videos for
retrieval are inputted, for extracting a feature vector of each of the
frames included in the inputted subject videos for retrieval and for
outputting the extracted feature vectors as frame feature vectors, a
frame feature vector storing portion 20 for storing the frame feature
vectors outputted by the frame feature vector extracting portion 10, a
video feature vector extracting portion 130 into which a query video is
inputted, for extracting feature vectors of the inputted query video and
for outputting the extracted vectors as video feature vectors, a video
feature vector cutout portion 170 for cutting out the frame feature
vectors that correspond to the time length of the query video inputted
into the video feature vector extracting portion 130, out of the frame
feature vectors stored in the frame feature vector storing portion 20,
and for outputting the cutout frame feature vectors as video feature
vectors, a similarity calculating portion 140 for comparing the video
feature vectors outputted by the video feature vector extracting portion
130 with the video feature vectors outputted by the video feature vector
cutout portion 170 to calculate the similarity of both vectors, a video
feature vector integrating portion 150 for integrating the video feature
vectors of which the similarities calculated by the similarity
calculating portion 140 satisfy a predetermined condition, out of the
video feature vectors outputted by the video feature vector cutout
portion 170, into one or a plurality of groups to thereby output the
integrated groups, and a similar video selecting portion 160 for
selecting one or a plurality of video feature vectors of the highest
similarity, out of the groups of video feature vectors outputted by the
video feature vector integrating portion 150, to thereby output the
selected video feature vectors. Thus, videos having the video feature
vectors outputted by the similar video selecting portion 160 are
outputted as a result of retrieval. Also, the video feature vector
integrating portion 150 is constituted by a video feature vector
selecting portion 151 for selecting the video feature vectors of which
the similarities calculated by the similarity calculating portion 140 are
equal to or larger than a predetermined value, out of the video feature
vectors outputted by the video feature vector cutout portion 170, and a
similar segment generating portion 152 for integrating the video feature
vectors that are either continuous or partially overlapping in time, out
of those selected by the video feature vector selecting portion 151, into
one group to thereby output the integrated group as a similar segment.
[0046] 

The description of the image retrieving method implemented by
the image retrieving apparatus having the above-described constitution
and arrangement will be provided hereinbelow.
[0047]

The videos that are subjects for retrieval are inputted into the
frame feature vector extracting portion 10, and the query videos are
inputted into the video feature vector extracting portion 130.
[0048]

The frame feature vector extracting portion 10 extracts the
feature vectors of the respective frames included in the inputted subject
videos for retrieval to output the extracted feature vectors as frame
feature vectors. At this stage, as the method of extracting the frame
feature vectors implemented by the frame feature vector extracting
portion 10, the method described in connection with the first embodiment
could be used.
[0049] 

The frame feature vectors outputted by the frame feature vector
extracting portion 10 are stored in the frame feature vector storing
portion 20.
[0050] 

In the video feature vector cutout portion 170, the frame feature
vectors corresponding to the time length of the query video inputted
into the video feature vector extracting portion 130 are cut out of
the frame feature vectors stored in the frame feature vector storing
portion 20, and are outputted as video feature vectors.
[0051] 

In the video feature vector extracting portion 130, the feature
vectors of the inputted query videos are extracted and are outputted as
video feature vectors.
[0052]

In the similarity calculating portion 140, the similarity between
the video feature vectors outputted by the video feature vector extracting
portion 130 and the video feature vectors outputted by the video feature
vector cutout portion 170 is calculated. At this stage, the similarity
calculation in the similarity calculating portion 140 is implemented in a
manner such that a similarity is calculated by the unit of each of the
frame feature vectors that are included in the video feature vectors
outputted by both the video feature vector extracting portion 130 and the
video feature vector cutout portion 170, and then the sum of the
similarities of the respective frame feature vectors is calculated.
Further, this similarity calculation can be achieved at an extremely high
speed by using the method disclosed in the aforementioned related art.
Furthermore, the similarity calculated by the similarity calculating
portion 140 may be outputted not only as the described sum of the
similarities for the respective frame feature vectors but also as an
average value, a median, or a mode.
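
The cutout and similarity calculation of paragraphs [0050] to [0052] can be sketched as follows. Several points are assumptions introduced here and not taken from the source: the cutout window is advanced one stored frame feature vector at a time, the per-frame measure bit_similarity is a placeholder, and the aggregate argument stands for the sum, average, median or mode mentioned above.

```python
from statistics import mean


def bit_similarity(a: int, b: int, bits: int = 64) -> float:
    """Placeholder per-frame similarity (assumption): matching-bit ratio."""
    return 1.0 - bin(a ^ b).count("1") / bits


def window_similarities(frame_codes, query_codes, aggregate=sum):
    """Cut out every window whose length equals the query length and
    aggregate the per-frame similarities of each window."""
    n = len(query_codes)
    scores = []
    for start in range(len(frame_codes) - n + 1):
        window = frame_codes[start:start + n]
        per_frame = [bit_similarity(f, q) for f, q in zip(window, query_codes)]
        scores.append(aggregate(per_frame))
    return scores


# Example: the query [1, 3] matches the stored sequence exactly at position 1.
sum_scores = window_similarities([0, 1, 3, 0], [1, 3])
mean_scores = window_similarities([0, 1, 3, 0], [1, 3], aggregate=mean)
```

Passing statistics.median or statistics.mode as aggregate gives the other two variants named in the text without changing the rest of the sketch.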
[0053] 

Thereafter, in the video feature vector selecting portion 151 in the
video feature vector integrating portion 150, only the video feature
vectors of which the similarities calculated by the similarity calculating
portion 140 satisfy a predetermined condition are selected out of
those outputted by the video feature vector cutout portion 170. At this
stage, the above-mentioned predetermined condition based on which the
video feature vector selecting portion 151 selects the video feature
vectors could be a condition such that a video feature vector is
selected only when its similarity calculated by the similarity
calculating portion 140 exceeds a predetermined threshold value. Also,
the predetermined threshold value can be adaptively varied as required.
[0054]

Subsequently, in the similar segment generating portion 152, the
video feature vectors that are either continuous or partly overlapping in
time, out of those selected by the video feature vector selecting portion
151, are integrated together into one group to be outputted as a similar
segment.
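
The integration of paragraph [0054] can be sketched as an interval merge. The representation is an assumption made for illustration: each selected video feature vector is given as a (start, score) pair, covers the half-open range of positions from start to start + length, and is merged into the current group when it overlaps or directly adjoins that group. The function name and dictionary keys are hypothetical.

```python
def merge_windows(selected, length):
    """Merge selected windows that are continuous or partly overlapping
    in time into groups, keeping the members of each group."""
    groups = []
    for start, score in sorted(selected):
        if groups and start <= groups[-1]["span"][1]:
            group = groups[-1]
            group["span"] = (group["span"][0],
                             max(group["span"][1], start + length))
            group["members"].append((start, score))
        else:
            groups.append({"span": (start, start + length),
                           "members": [(start, score)]})
    return groups


# Example: three overlapping or adjoining windows and one isolated window.
groups = merge_windows([(0, 1.9), (1, 2.0), (3, 1.8), (10, 1.7)], length=3)
best = [max(g["members"], key=lambda m: m[1]) for g in groups]
```

Taking the highest-scoring member of each group, as in the last line, corresponds to the selection later performed by the similar video selecting portion 160.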

[0055] 

Figure 4 is a diagrammatic view illustrating the processing
implemented by the similar segment generating portion 152 shown in
Fig. 3.
[0056]

As shown in Fig. 4, when the video feature vectors outputted by
the video feature vector cutout portion 170 exist continuously, the
video feature vectors in the segment in which they exist continuously are
assembled together to generate a similar segment.
[0057]

Then, in the similar video selecting portion 160, one or a plurality
of video feature vectors of the highest similarity within the similar
segment outputted by the similar segment generating portion 152 are
selected. Further, the videos that have the video feature vectors selected
by the similar video selecting portion 160 are presented as a result of
retrieval.
[0058]

In the above-described embodiment, since all of the frame feature
vectors extracted from the videos that are subjects for retrieval are
collated with the video feature vectors of the query videos, the retrieving
of the similar video segments can be implemented by the unit of frame.
Furthermore, instead of proposing all of the similar segments, segments
in which the similar video feature vectors exist continuously are
integrated into respective groups, and some videos of the
highest similarities in the respective groups are selected. Accordingly,
the retrieving of the videos can be realized while controlling or
suppressing the number of similar videos to be nominated.
[0059]

Further, in the present embodiment, the opening of a specific
program and a common source for the news can be reliably retrieved
without a shift of the start position. Also, when, for example, a given
CM is inputted as a query video, the number of broadcasts and the
time zones of the broadcasts can be accurately ascertained by the unit of
frame. Moreover, if, for example, a highlight scene of a soccer game is
inputted as a query video, it is possible to adaptively implement such a
retrieval that the same or a similar scene is detected from a relay
broadcast of the soccer game as a similar video segment. In this way, a
video that is very similar, although not identical in content, can be
obtained.
[0060] 

While the above-described two embodiments are preferred forms
of the present invention, the present invention is not intended to be
limited thereto, and various changes and modifications will occur to
those skilled in the art without departing from the spirit of the present
invention.
[0061]

Further, a program permitting a computer to implement the
above-described image retrieving method may be recorded in a recording
medium such as an EPROM (an erasable PROM) so as to be widely used.
[0062] 

[Effect of the Invention] 

As described in the foregoing, according to the image retrieving
apparatus of the present invention, query images are collated with all of
the frame feature vectors extracted from videos that are subjects for
retrieval, and therefore the retrieving of the similar images may be
implemented by the unit of frame. Also, instead of proposing all of the
similar frames as a result of retrieval, segments in which similar frames
exist continuously are formed into at least one group, and some images
having the highest similarities are selected out of the respective groups
in order to retrieve similar images. Accordingly, the retrieving of the
images can be realized with a suppressed number of nominated similar
images.
[0063] 

Further, since the query videos are collated with all of the frame
feature vectors extracted from the videos that are subjects for retrieval,
the retrieving of similar video segments may be implemented by the unit
of frame. Furthermore, instead of proposing all similar segments,
segments in which similar video feature vectors exist continuously are
respectively formed into at least one group, and some of the most similar
videos are selected from the respective groups to retrieve the similar
videos. Therefore, the retrieving of the videos can be realized with a
suppressed number of nominated similar videos.

[Brief explanation of the Drawings] 
[Fig. 1] 

Fig. 1 is a block diagram illustrating an image retrieving 
apparatus according to a first embodiment of the present invention; 
[Fig. 2]

Fig. 2 is a diagrammatic view used for explaining the processing
implemented in the similar segment generating portion shown in Fig. 1;

[Fig. 3]

Fig. 3 is a block diagram illustrating an image retrieving
apparatus according to a second embodiment of the present invention;
and,

[Fig. 4]

Fig. 4 is a diagrammatic view used for explaining the processing
implemented in the similar segment generating portion shown in Fig. 3.

[Explanation of numerals]

10 Frame feature vector extracting portion
20 Frame feature vector storing portion
30 Image feature vector extracting portion
40, 140 Similarity calculating portion
50 Frame feature vector integrating portion
51 Frame feature vector selecting portion
52, 152 Similar segment generating portion
60 Similar image selecting portion
130 Video feature vector extracting portion
150 Video feature vector integrating portion
151 Video feature vector selecting portion
160 Similar video selecting portion
170 Video feature vector cutout portion

[Name of the Document] Abstract
[Abstract] 

[Problem to be solved] 

To retrieve images while controlling the number of similar images
nominated and presented, and while implementing the retrieving
of the similar images by the unit of frame.

[Means for solving Problem] 

In the similarity calculating portion 40, a similarity is calculated
between the frame feature vectors of the subject videos for retrieval,
stored in the frame feature vector storing portion 20, and the image
feature vectors of the query images extracted by the image feature
vector extracting portion 30. In the frame feature vector selecting
portion 51, the frame feature vectors of which the similarities are
higher than a predetermined threshold value are selected. Furthermore,
in the similar segment generating portion 52, the frame feature vectors
that are continuous in time, out of those selected by the frame feature
vector selecting portion 51, are integrated into one group. Thereafter,
in the similar image selecting portion 60, at least one frame feature
vector of the highest similarity inside the integrated group is selected
so as to present an image having the selected frame feature vector as a
result of retrieval.

[Chosen Drawing] Fig. 1